Deriving a strategy for synthesizing lengthening disfluencies based on spontaneous conversational speech data
Authors
Abstract
Our overarching research project explores the usability of disfluencies in incremental spoken dialogue systems. This endeavor requires basic phonetic research on disfluencies in spontaneous speech corpora in order to define strategies for synthesizing disfluencies in a meaningful way. In this paper, we focus on disfluency-related lengthening as a promising time-buying strategy in synthesized dialogue [1][2]. We base our analyses on the results of a search tool that aims to automatically detect lengthening in spontaneous speech corpora where it occurs without adjacency to phrase boundaries or other disfluencies, i.e. standalone lengthening phenomena. We analyzed disfluency-related lengthening in the "monomodal" half of the GECO corpus [3] with regard to its context, word class, syllable position and phone type. We then postulate a disfluency insertion strategy for synthetic speech that prioritizes lengthening phenomena based on the results obtained in our study.
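The abstract does not say how the search tool works internally. As a purely illustrative sketch of what "standalone lengthening" detection could look like, the Python below flags phones whose duration is unusually long for their phone type and that are not adjacent to a phrase boundary or another disfluency. The `Phone` record, the `near_boundary` and `near_disfluency` flags, and the z-score threshold are all assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Phone:
    """Hypothetical phone-level annotation (not the paper's data format)."""
    label: str                     # phone type, e.g. "a" or "m"
    duration: float                # duration in seconds
    near_boundary: bool = False    # adjacent to a phrase boundary
    near_disfluency: bool = False  # adjacent to another disfluency


def find_standalone_lengthening(phones, z_threshold=2.0):
    """Return indices of phones that are long for their type and stand alone.

    The z-score criterion is an assumed stand-in for whatever duration
    criterion the actual search tool uses.
    """
    # Collect all durations per phone type to estimate a per-type baseline.
    durations_by_label = defaultdict(list)
    for phone in phones:
        durations_by_label[phone.label].append(phone.duration)

    hits = []
    for index, phone in enumerate(phones):
        durations = durations_by_label[phone.label]
        spread = pstdev(durations)
        if spread == 0:
            continue  # no variation for this phone type, nothing to flag
        z_score = (phone.duration - mean(durations)) / spread
        standalone = not (phone.near_boundary or phone.near_disfluency)
        if z_score > z_threshold and standalone:
            hits.append(index)
    return hits


# Usage: ten ordinary /a/ tokens, one long standalone /a/, and one long /a/
# next to a phrase boundary; only the standalone token (index 10) is flagged.
example = [Phone("a", 0.08) for _ in range(10)]
example += [Phone("a", 0.40), Phone("a", 0.40, near_boundary=True)]
print(find_standalone_lengthening(example))
```

A per-phone-type baseline is used here because intrinsic durations differ across phone types. A real tool would probably also normalize for speaking rate, which this sketch leaves out.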
Similar resources
Disfluent Lengthening in Spontaneous Speech
We investigate lengthening in spontaneous speech with the aim of using it as a time-management strategy in incremental spoken dialogue systems. Lengthening is a common feature of speech, occurring regularly near the edges of intonation phrases. It behaves similarly to disfluencies when it occurs in places remote from phrasal boundaries. Disfluencies have proven useful in incremental spoken ...
Automatic Detection of Sentence Boundaries, Disfluencies, and Conversational Fillers in Spontaneous Speech
Detecting Structural Metadata with Decision Trees and Transformation-Based Learning
The regular occurrence of disfluencies is a distinguishing characteristic of spontaneous speech. Detecting and removing such disfluencies can substantially improve the usefulness of spontaneous speech transcripts. This paper presents a system that detects various types of disfluencies and other structural information with cues obtained from lexical and prosodic information sources. Specifically...
Synthesising Uncertainty: The Interplay of Vocal Effort and Hesitation Disfluencies
As synthetic voices become more flexible, and conversational systems gain more potential to adapt to the environmental and social situation, the question needs to be examined, how different modifications to the synthetic speech interact with each other and how their specific combinations influence perception. This work investigates how the vocal effort of the synthetic speech together with adde...
Conversational spontaneous speech synthesis using average voice model
This paper describes conversational spontaneous speech synthesis based on hidden Markov model (HMM). To reduce the amount of data required for model training, we utilize an average-voice-based speech synthesis framework, which has been shown to be effective for synthesizing speech with arbitrary speaker’s voice using a small amount of training data. We examine several kinds of average voice mod...